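Notes
Analyzing, troubleshooting, and resolving an application incident
Background
On the morning of the 9th we received CPU alerts. At the same time, the business team reported that interfaces of an upstream service that depends on this service were responding far too slowly.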
```
Application alert - CPU usage, alert changed
[WARNING] Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-m47jz: pod CPU request usage >= 90.000000%, current value 138.4971051199925%
Occurred: 2024/10/09 11:17:33

Project XXX, cluster qd-aliyun, partition bbbb-prod, application customer, instance customer-6fb6448688-28pvs: pod CPU request usage >= 90.000000%, current value 157.7076205766934%, alert recovered
Occurred: 2024/10/09 11:06:33
Recovered: 2024/10/09 12:24:33
```
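Service traffic
Peak QPS per instance is around 100.
Why look at QPS? Because 100 QPS should not consume this much CPU, especially since the request and response bodies are both small.
Pod monitoring
Pod quota
- CPU request: 2 cores; CPU limit: 3 cores
- Memory request: 7 GiB; memory limit: 9 GiB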
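To put the alert values in absolute terms, here is a small arithmetic sketch. It assumes the alert percentage is measured against the pod's CPU request of 2 cores, which the alert wording suggests but does not state outright. The class and method names are made up for illustration. Under that assumption the two alert values work out to roughly 2.8 and 3.2 cores, which is around the 3-core limit.

```java
// Illustrative arithmetic only: assumes the alert percentage is relative to
// the pod's CPU request (2 cores). CpuQuotaCheck is a hypothetical helper.
final class CpuQuotaCheck {

    /** Converts "percent of CPU request" into an absolute number of cores. */
    static double coresUsed(double percentOfRequest, double requestCores) {
        return percentOfRequest / 100.0 * requestCores;
    }

    public static void main(String[] args) {
        double requestCores = 2.0; // CPU request from the pod quota
        double limitCores = 3.0;   // CPU limit from the pod quota
        double[] alertValues = {138.4971051199925, 157.7076205766934};
        for (double percent : alertValues) {
            double cores = coresUsed(percent, requestCores);
            System.out.printf("%.2f%% of request = %.2f cores (limit %.1f)%n",
                    percent, cores, limitCores);
        }
    }
}
```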
The monitoring chart shows:
- CPU load stayed high throughout
- TCP connections and thread count started climbing steeply at 11:40
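ARMS
Trace monitoring shows that the time is mostly spent in customer calling external interfaces through Feign.
Temporary fix
As a stopgap, we scaled out the instances and increased the CPU allocation.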
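Root cause analysis
The investigation of the third-party interfaces and the open platform gateway is skipped here. The conclusion: neither the third-party interfaces we depend on nor the open platform gateway had a problem.
We checked them first because, judging from the Trace, the calls to the third-party interfaces were what took too long.
The ARMS chart shows:
- CPU time is concentrated in the Decoder and Encoder of the Feign calls
- Within both the Decoder and the Encoder, the time is concentrated in
  - HttpMessageConverters#getDefaultConverters() =>
  - WebMvcConfigurationSupport#addDefaultHttpMessageConverters =>
  - ... (see the trace excerpt below for the full call chain)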
```
feign.ReflectiveFeign$BuildTemplateByResolvingArgs.create(Object[]) (14.37%, 1.43 minutes)
feign.ReflectiveFeign$BuildEncodedTemplateFromArgs.resolve(Object[], RequestTemplate, Map) (14.37%, 1.43 minutes)
org.springframework.cloud.openfeign.support.SpringEncoder.encode(Object, Type, RequestTemplate) (14.28%, 1.42 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig$$Lambda$938.56729293.getObject() (13.98%, 1.39 minutes)
com.jiankunking.common.core.feign.FeignClientsConfig.lambda$feignEncoder$2() (13.98%, 1.39 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(HttpMessageConverter[]) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.<init>(boolean, Collection) (12.03%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters.getDefaultConverters() (12.02%, 1.19 minutes)
org.springframework.boot.autoconfigure.http.HttpMessageConverters$1.defaultMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.getMessageConverters() (12.02%, 1.19 minutes)
org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport.addDefaultHttpMessageConverters(List) (12.02%, 1
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.build() (5.93%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.configure(ObjectMapper) (5.91%, 0.59 minutes)
org.springframework.http.converter.json.Jackson2ObjectMapperBuilder.registerWellKnownModulesIfAvailable(Map) (5.89%, 0.58 min
org.springframework.util.ClassUtils.forName(String, ClassLoader) (5.84%, 0.58 minutes)
java.lang.Class.forName(String, boolean, ClassLoader) (5.83%, 0.58 minutes)
java.lang.Class.forName0(String, boolean, ClassLoader, Class) (5.83%, 0.58 minutes)
......
```
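The trace suggests that every SpringEncoder.encode call goes through the factory lambda and constructs a fresh HttpMessageConverters. The toy model below shows why that cost grows with the request count. It is plain Java, not Spring code: ConverterSet and ToyEncoder are hypothetical stand-ins for HttpMessageConverters and SpringEncoder, and the counter stands in for the expensive constructor work.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy model, not Spring code: ConverterSet and ToyEncoder are hypothetical
// stand-ins for HttpMessageConverters and SpringEncoder.
final class PerCallFactoryDemo {
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    /** Stand-in for an object whose constructor does expensive setup. */
    static final class ConverterSet {
        ConverterSet() {
            CONSTRUCTIONS.incrementAndGet(); // count every construction
        }
    }

    /** Stand-in for an encoder that asks its factory for converters on every call. */
    static final class ToyEncoder {
        private final Supplier<ConverterSet> factory;

        ToyEncoder(Supplier<ConverterSet> factory) {
            this.factory = factory;
        }

        void encode(Object body) {
            ConverterSet converters = factory.get(); // invoked once per request
            // Serialization of `body` with `converters` would happen here.
        }
    }

    /** Runs `requests` encode calls and returns how many ConverterSets were built meanwhile. */
    static int constructionsFor(Supplier<ConverterSet> factory, int requests) {
        CONSTRUCTIONS.set(0);
        ToyEncoder encoder = new ToyEncoder(factory);
        for (int i = 0; i < requests; i++) {
            encoder.encode("payload");
        }
        return CONSTRUCTIONS.get();
    }

    public static void main(String[] args) {
        // Factory that builds a new object on each call.
        System.out.println("new per call: " + constructionsFor(ConverterSet::new, 100));
        // Factory that hands back one instance built ahead of time.
        ConverterSet shared = new ConverterSet();
        System.out.println("shared:       " + constructionsFor(() -> shared, 100));
    }
}
```

In this model the per-call factory builds one object per request, while the shared factory builds none during the run. Whether sharing one instance is appropriate in a real application depends on how the converters are configured, so treat this as an illustration of the cost pattern only.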
Custom Encoder and Decoder
Encoder
Here is the Encoder in com.jiankunking.common.core.feign.FeignClientsConfig:
```java
public Encoder feignEncoder() {
    // The lambda builds a new HttpMessageConverters every time the factory is invoked.
    ObjectFactory<HttpMessageConverters> objectFactory =
            () -> new HttpMessageConverters(new RMappingJackson2HttpMessageConverter());
    return new SpringEncoder(objectFactory);
}

public class RMappingJackson2HttpMessageConverter extends MappingJackson2HttpMessageConverter {

    public RMappingJackson2HttpMessageConverter(ObjectMapper objectMapper) {
        super(objectMapper);
        List<MediaType> mediaTypes = new ArrayList<>();
        mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
        mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
        setSupportedMediaTypes(mediaTypes);
    }

    RMappingJackson2HttpMessageConverter() {
        List<MediaType> mediaTypes = new ArrayList<>();
        mediaTypes.add(MediaType.valueOf(MediaType.APPLICATION_JSON_UTF8_VALUE));
        mediaTypes.add(MediaType.valueOf(MediaType.TEXT_HTML_VALUE + ";charset=UTF-8"));
        setSupportedMediaTypes(mediaTypes);
    }
}
```
Decoder
Here is the Decoder in com.jiankunking.common.core.feign.FeignClientsConfig:
```java
public Decoder feignDecoder() {
    HttpMessageConverter jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
    // As in the Encoder, a new HttpMessageConverters is built on every factory call.
    ObjectFactory<HttpMessageConverters> objectFactory = () -> new HttpMessageConverters(jacksonConverter);
    return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
}

public ObjectMapper customObjectMapper() {
    ObjectMapper objectMapper = new ObjectMapper();

    objectMapper.registerModule(new StringToDateModule());
    objectMapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
    objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
    objectMapper.configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true);
    objectMapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
    objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    return objectMapper;
}
```
Googling 'spring feign encode jackson cpu usage high' turned up:
=> https://segmentfault.com/a/1190000043037032
=> https://mp.weixin.qq.com/s/RuqltkN9VdVQ1K3GKuJ-Gw
=> https://meantobe.github.io/2019/12/21/ClassLoader/
Source code analysis
Here is the code of registerWellKnownModulesIfAvailable:
```java
@SuppressWarnings("unchecked")
private void registerWellKnownModulesIfAvailable(Map<Object, Module> modulesToRegister) {
    try {
        Class<? extends Module> jdk8ModuleClass = (Class<? extends Module>)
                ClassUtils.forName("com.fasterxml.jackson.datatype.jdk8.Jdk8Module", this.moduleClassLoader);
        Module jdk8Module = BeanUtils.instantiateClass(jdk8ModuleClass);
        modulesToRegister.put(jdk8Module.getTypeId(), jdk8Module);
    }
    catch (ClassNotFoundException ex) {
        // jackson-datatype-jdk8 not available
    }

    try {
        Class<? extends Module> javaTimeModuleClass = (Class<? extends Module>)
                ClassUtils.forName("com.fasterxml.jackson.datatype.jsr310.JavaTimeModule", this.moduleClassLoader);
        Module javaTimeModule = BeanUtils.instantiateClass(javaTimeModuleClass);
        modulesToRegister.put(javaTimeModule.getTypeId(), javaTimeModule);
    }
    catch (ClassNotFoundException ex) {
        // jackson-datatype-jsr310 not available
    }

    // Joda-Time present?
    if (ClassUtils.isPresent("org.joda.time.LocalDate", this.moduleClassLoader)) {
        try {
            Class<? extends Module> jodaModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.datatype.joda.JodaModule", this.moduleClassLoader);
            Module jodaModule = BeanUtils.instantiateClass(jodaModuleClass);
            modulesToRegister.put(jodaModule.getTypeId(), jodaModule);
        }
        catch (ClassNotFoundException ex) {
            // jackson-datatype-joda not available
        }
    }

    // Kotlin present?
    if (KotlinDetector.isKotlinPresent()) {
        try {
            Class<? extends Module> kotlinModuleClass = (Class<? extends Module>)
                    ClassUtils.forName("com.fasterxml.jackson.module.kotlin.KotlinModule", this.moduleClassLoader);
            Module kotlinModule = BeanUtils.instantiateClass(kotlinModuleClass);
            modulesToRegister.put(kotlinModule.getTypeId(), kotlinModule);
        }
        catch (ClassNotFoundException ex) {
            if (!kotlinWarningLogged) {
                kotlinWarningLogged = true;
                logger.warn("For Jackson Kotlin classes support please add " +
                        "\"com.fasterxml.jackson.module:jackson-module-kotlin\" to the classpath");
            }
        }
    }
}
```
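The logic is: if Joda-Time's LocalDate is on the classpath, Spring tries to load Jackson's matching JodaModule through the class loader (here LaunchedURLClassLoader). If jackson-datatype-joda is not on the classpath, that lookup ends in ClassNotFoundException, and because a failed lookup is not cached, the class loader repeats the full search on every call.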
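The difference between looking up a class that exists and one that does not can be seen with a JDK-only sketch. CountingLoader is a hypothetical class loader written for this illustration, and com.example.DoesNotExist is a made-up name that is assumed to be absent from the classpath. The counter records how often a lookup falls all the way through to findClass, the step where a real class loader would scan its jars.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch using only the JDK: CountingLoader is a hypothetical class loader
// that records how often a lookup falls through to findClass.
final class MissingClassLookupDemo {

    static final class CountingLoader extends ClassLoader {
        final AtomicInteger misses = new AtomicInteger();

        CountingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            misses.incrementAndGet(); // reached only after the parent chain failed
            throw new ClassNotFoundException(name);
        }
    }

    /** Calls Class.forName `attempts` times and returns how many lookups reached findClass. */
    static int searchesFor(String className, int attempts) {
        CountingLoader loader = new CountingLoader(MissingClassLookupDemo.class.getClassLoader());
        for (int i = 0; i < attempts; i++) {
            try {
                Class.forName(className, false, loader);
            } catch (ClassNotFoundException ex) {
                // Swallowed, like the catch blocks in registerWellKnownModulesIfAvailable.
            }
        }
        return loader.misses.get();
    }

    public static void main(String[] args) {
        System.out.println("present class: " + searchesFor("java.util.ArrayList", 1000));
        System.out.println("missing class: " + searchesFor("com.example.DoesNotExist", 1000));
    }
}
```

A class that exists is resolved without ever reaching findClass, while the missing class triggers a full search on each attempt. A real LaunchedURLClassLoader does far more work per miss than this counter, so the sketch shows only the repetition, not the actual cost.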
Why not suspect the jdk8ModuleClass and javaTimeModuleClass lookups? Because the common package already depends on the following two packages:
```groovy
compile "com.fasterxml.jackson.datatype:jackson-datatype-jdk8:${v.jacksonDatatype}"
compile "com.fasterxml.jackson.datatype:jackson-datatype-jsr310:${v.jacksonDatatype}"
```
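With that, the solution is clear.
Solution
Avoid repeated loading by the ClassLoader
Add this dependency to the project. Once the class has been loaded, later calls get it through findLoadedClass, which cuts the resources spent on class loading.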
```xml
<dependency>
    <groupId>com.fasterxml.jackson.datatype</groupId>
    <artifactId>jackson-datatype-joda</artifactId>
    <version>x.x.x</version>
</dependency>
```
Avoid repeated initialization of HttpMessageConverters
```java
public Decoder feignDecoder() {
    HttpMessageConverter jacksonConverter = new MappingJackson2HttpMessageConverter(customObjectMapper());
    ObjectFactory<HttpMessageConverters> objectFactory =
            () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
    return new ResponseEntityDecoder(new RSpringDecoder(objectFactory));
}

public Encoder feignEncoder() {
    HttpMessageConverter jacksonConverter = new RMappingJackson2HttpMessageConverter(customObjectMapper());
    ObjectFactory<HttpMessageConverters> objectFactory =
            () -> new HttpMessageConverters(false, Collections.singletonList(jacksonConverter));
    return new SpringEncoder(objectFactory);
}
```
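Results after optimization
After the change, CPU usage dropped by half.
Summary
When you customize Feign's encoder and decoder with SpringEncoder / SpringDecoder, avoid initializing HttpMessageConverters repeatedly. If you do not need the default HttpMessageConverter instances, pass false as the first argument when constructing HttpMessageConverters so that the defaults are not initialized.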